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Abstract — In many compressive sensing problems today, the 
relationship between the measurements and the unknowns could 
be nonlinear. Traditional treatment of such nonlinear relation- 
ships have been to approximate the nonlinearity via a linear 
model and the subsequent un-modeled dynamics as noise. The 
ability to more accurately characterize nonlinear models has the 
potential to improve the results in both existing compressive 
sensing applications and those where a linear approximation 
does not suffice, e.g., phase retrieval. In this paper, we extend 
the classical compressive sensing framework to a second-order 
Taylor expansion of the nonlinearity. Using a lifting technique 
and a method we call quadratic basis pursuit, we show that 
the sparse signal can be recovered exactly when the sampling 
rate is sufficiently high. We further present efficient numerical 
algorithms to recover sparse signals in second-order nonlinear 
systems, which are considerably more difficult to solve than their 
linear counterparts in sparse optimization. 

I. Introduction 

Consider the problem of finding the sparsest signal x 
satisfying a system of linear equations: 

min ||x||o 

subj. to yi = bjx, yi e M, bi e K n , i = 1, . . . , TV 

This problem is known to be combinatorial and NP-hard [2] 
and a number of approaches to approximate its solution have 
been proposed. One of the most well known approaches is to 
relax the zero norm and replace it with the i?i-norm: 

min || a; ||i subj. to yi~bjx, i = l,...,N. (2) 

This approach is often referred to as basis pursuit (BP) [3]. 

The ability to recover the optimal solution to (1) is essential 
in the theory of compressive sensing (CS) [4], [5] and a 
tremendous amount of work has been dedicated to solving and 
analyzing the solution of (1) and (2) in the last decade. Today 
CS is regarded as a powerful tool in signal processing and 
widely used in many applications. For a detailed review of the 
literature, the reader is referred to several recent publications 
such as [6], [7]. 

Ohlsson, Yang, Dong, and Sastry are with the Department of Electrical 
Engineering and Computer Sciences, University of California, Berkeley, CA, 
USA. Ohlsson is also with the Division of Automatic Control, Department of 
Electrical Engineering, Linkoping University, Sweden. Verhaegen is with the 
Delft Center for Systems and Control, Delft University, Delft 2628CD, The 
Netherlands. Corresponding author: Henrik Ohlsson, Cory Hall, University of 
California, Berkeley, CA 94720. Email: ohlsson@eecs.berkeley.edu. 

The authors gratefully acknowledge support by the Swedish Research 
Council in the Linnaeus center CADICS, the European Research Council 
under the advanced grant LEARN, contract 267381, by a postdoctoral grant 
from the Sweden-America Foundation, donated by ASEA's Fellowship Fund, 
by a postdoctoral grant from the Swedish Research Council, and by ARO 
63092-MA-II and DARPA FA8650-1 1-1-7153. 

This paper was presented in part at NIPS 2012, Lake Tahoe, USA, Dec 
3-6, 2012, [1]. 



It has recently been shown that CS can be extended to 
nonlinear models. More specifically, the relatively new topic 
of nonlinear compressive sensing (NLCS) deals with a more 
general problem of finding the sparsest signal a; to a nonlinear 
set of equations: 

min 1 1 co 1 1 o 
subj. to yi = fi(x), ^61,1=1,..., TV, 

where each fa : R n — > R is a continuously differentiable 
function. Compared to CS, the literature on NLCS is still 
very limited. The interested reader is referred to [8], [9] and 
references therein. 

In this paper, we will restrict our attention from rather gen- 
eral nonlinear systems, and instead focus on nonlinearities that 
depends quadratically on the unknown x. More specifically, 
we consider the following problem formulated in the complex 
domain: 

min || a; || o 

subj. to yi = m + b"x + x H Ci + x h Q { x, (4) 
i = l,...,N, 

where a, e C, b,,c t e C", y t G C, and Q t € C nxn , i = 
1, ... ,7V. In a sense, being able to solve (4) would make it 
possible to apply the principles of CS to a second-order Taylor 
expansion of the nonlinear relationship in (3), while traditional 
CS mainly considers its linear approximation or first-order 
Taylor expansion. In particular, in the most simple case, when 
a second order Taylor expansion is taken around zero (i.e., a 
Maclaurin expansion), let a, = fi(0), bi = Ci = Vj/j(0)/2 
and Q l = V£/i(0)/2, i = 1.....JV, with V x and V 2 X 
denoting the gradient and Hessian with respect to x. In this 
case, Q is a Hermitian matrix. Nevertheless, we note that our 
derivations in the paper does not depend on the matrix Q to 
be symmetric in the real domain or Hermitian in the complex 
domain. 

In another motivating example, we consider the well-known 
phase retrieval problem in x-ray crystallography, see for 
instance [10], [11], [12], [13], [14], [15]. The underlying 
principal of x-ray crystallography is that the information about 
the crystal structure can be obtained from its diffraction pattern 
by hitting the crystal by an x-ray beam. Due to physical lim- 
itations, typically only the intensity of the diffraction pattern 
can be measured but not its phase. This leads to a nonlinear 
relation 

Vi = \a"x\ 2 = x^a^x, i = l,...,N, (5) 

between the measurements yi,. ■ . ,Un € M and the structural 
information contained in x e C™. The complex vectors 
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ai,...,ajv € C" are known and H denotes the conjugate 
transpose. The mathematical problem of recovering x from 
yi , . . . , y N , and a\ , . . . , ajv is referred to as the phase retrieval 
problem. The traditional phase retrieval problem is known to 
be combinatorial [16]. 

If x is sparse under an appropriate basis in (5), the problem 
is referred to as compressive phase retrieval (CPR) in [17], 
[18] or quadratic compressed sensing (QCS) in [19]. These 
algorithms can be applied to several important imaging ap- 
plications, such as diffraction imaging [20], astronomy [21], 
[22], optics [23], x-ray tomography [24], microscopy [25], 
[26], [27], and quantum mechanics [28], to mention a few. 
As we will later show, our solution as a convex relaxation 
of (4), called quadratic basis pursuit (QBP), can be readily 
applied to solve this type of problems, namely, let = bi = 
Ci = 0, Q l = dirf, i = l,...,N. 

A. Contributions 

The main contribution of this paper is a novel convex 
technique for solving the sparse quadratic problem (4), namely, 
QBP. The proposed framework is not a greedy algorithm and 
inherits desirable properties, e.g., perfect recovery, from BP 
and the traditional CS results. In comparison, most of the 
existing solutions for sparse nonlinear problems are greedy 
algorithms, and therefore their ability to give global conver- 
gence guarantees is limited. 

Another contribution is an efficient numerical algorithm that 
solves the QBP problem and compares favorably to other ex- 
isting sparse solvers in convex optimization. The algorithm is 
based on alternating direction method of multipliers (ADMM). 
Applying the algorithm to the complex CPR problem, we 
show that the QBP approach achieves the state-of-the-art 
result compared to other phase retrieval solutions when the 
measurements are under-sampled. 

In Section II, we will first develop the main theory of QBP. 
In Section III, we present the ADMM algorithm. Finally, in 
Section IV, we conduct comprehensive experiments to validate 
the performance of the new algorithm on both synthetic and 
more practical imaging data. 

B. Literature Review 

To the best of our knowledge, this paper is the first work 
focusing on recovery of sparse signals from systems of general 
quadratic equations. Overall, the literature on nonlinear sparse 
problems and NLCS is also very limited. One of the first 
papers discussing these topics is [29]. They present a greedy 
gradient based algorithm for estimating the sparsest solution to 
a general nonlinear equation system. A greedy approach was 
also proposed in [30] for the estimation of sparse solutions 
of nonlinear equation systems. The work of [8] proposed 
several iterative hard-thresholding and sparse simplex pursuit 
algorithms. As the algorithms are nonconvex greedy solutions, 
the analysis of the theoretical convergence only concerns about 
their local behavior. In [9], the author also considered a gener- 
alization of the restricted isometry property (RIP) to support 
the use of similar iterative hard-thresholding algorithms for 
solving general NLCS problems. 



Our paper is inspired by several recent works on CS applied 
to the phase retrieval problem [17], [31], [32], [19], [18], [33], 
[27], [34], [35], [36]. In particular, the generalization of com- 
pressive sensing to CS was first proposed in [17]. In [19], the 
problem was also referred to as QCS. These methods typically 
do not consider a general quadratic constraint as in (4) but a 
pure quadratic form (i.e., a, = 6j = Ci = 0, i = 1, . . . , N, in 
(4)). 

In terms of the numerical algorithms that solves the CPR 
problem, most of the existing methods are greedy algorithms, 
where a solution to the underlying non-convex problem is 
sought by a sequence of local decisions [17], [31], [19], [33], 
[27], [36]. In particular, the QCS algorithm in [19] used a 
lifting technique similar to that in [37], [38], [39], [40] and 
iterative rank minimization resulting in a series of semidefinite 
programs (SDPs) that would converge to a local optimum. 

The first work that applied the lifting technique to the 
PR and CPR problems was presented in [32]. Extensions 
of similar techniques were also studied in [41], [34]. The 
methods presented in our previous publications [1], [18] were 
also based on the lifting technique. It is important to note 
that the algorithms proposed in [32], [1], [18] are non-greedy 
global solutions, which are different from the previous local 
solutions [17], [19]. Our work was inspired by the solutions 
to phase retrieval via low-rank approximation in [32], [16], 
[42]. Given an oversampled phase retrieval problem, a lifting 
technique was used to relax the nonconvex problem with a 
SDP The authors of [16], [42] also derived an upper-bound for 
the sampling rate that guarantees exact recovery in the noise- 
free case and stable recovery in the noisy case. Nevertheless, 
the work in [16], [42] only addressed the oversampled phase 
retrieval problem but not CPR or NLCS. The only similarities 
between our work and theirs are the lifting technique and 
convex relaxation. This lifting technique has also been used 
in other topics to convert nonconvex quadratic problems to 
SDPs, see for instance [43], [34]. The work presented in [32] 
and our previous contributions [1], [18] only discussed the 
CPR problem. 

Finally, in [35], a message passing algorithm similar to that 
in CS was proposed to solve the compressive phase retrieval 
problem. The work in [44] further considered stability and 
uniqueness in real phase retrieval problems. CPR has also been 
shown useful in practice and we refer the interested reader to 
[17], [27] for two very nice contributions. Especially fascinat- 
ing we find the work presented in [27] where the authors show 
how CPR can be used to facilitate sub-wavelength imaging in 
microscopy. 

C. Notation and Assumptions 

In this paper, we will use bold face to denote vectors and 
matrices and normal font for scalars. We denote the transpose 
of a real vector by x T and the conjugate transpose of a 
complex vector by x H . Xij is used to denote the (i, j)th 
element, Xj >: the zth row and X : j the jth column of a 
matrix X, respectively. We will use the notation X ii:i2 j i: j 2 
to denote a submatrix constructed from rows i\ to i 2 and 
columns j\ to j 2 of X. Given two matrices X and Y, 



3 



we use the following fact that their product in the trace 
function commutes, namely, Tr(XY) = Tr(YX), under 
the assumption that the dimensions match. j| • || counts the 
number of nonzero elements in a vector or matrix; similarly, 
|| • ||i denotes the element- wise ^i-norm of a vector or matrix, 
i.e., , the sum of the magnitudes of the elements; whereas || • | 
represents the ^ 2 -norm for vectors and the spectral norm for 
matrices. 



II. Quadratic Basis Pursuit 
A. Convex Relaxation via Lifting 

As optimizing the ^o- norm function in (4) is known to be 
a combinatorial problem, in this section, we first introduce a 
convex relaxation of (4). 

It is easy to see that the general quadratic constraint of (4) 
can be rewritten as the quadratic form: 



Vi= [1 



x' 







l 






X 



2 = 1, 



,N. 



Since each y,- L is a scalar, we further have 
yi=Tr([l x"] 



r »,h- 




l 






X 



= Tr 



Define <I>, = 



c, 



^ Q 
andX = 



[1 



x' 



(6) 

(7) 
(8) 



[l a; H ] , both matrices 

of dimensions (n+1) x (n + 1). The operation that constructs 

X from the vector is known as the lifting operator [37], 

[38], [39], [40]. By definition, X is a Hermitian matrix, and 
it satisfies the constraints that X\ t \ = 1 and rank(X) = 1. 
Hence, (4) can be rewritten as 



min x H^llo 
subj. to y i =T[(& i X), i = l,...,N, 
rank(X) = 1, X hl = 1, X h 0. 



When the optimal solution X* is found, the unknown x can 
be obtained by the rank-1 decomposition of X* via singular 
value decomposition (SVD). 

The above problem is still non-convex and combinatorial. 
Therefore, solving it for any moderate size of n is impractical. 
Inspired by recent literature on matrix completion [45], [32], 
[16], [42] and sparse PC A [46], we relax the problem into the 
following convex semidefinite program (SDP): 

min x Tr(X) + A||X|| 1 
subj. to yi=Tr($iX), i = l,...,N, (10) 
X M = 1, XhO, 

where A > is a design parameter. In particular, the trace of 
X is a convex surrogate of the low-rank condition and ||X||i 
is the well-known convex surrogate for ||X|| in (9). We refer 
to the approach as quadratic basis pursuit (QBP). 

One can further consider a noisy counterpart of the QBP 
problem, where some deviation between the measurements and 
the estimates is allowed. More specifically, we propose the 



following quadratic basis pursuit denoising (QBPD) problem: 

minx Tr(X) + A||X|| 1 
subj. to \\yi-Tr(*iX)f<e, (11) 

Xi,i = l, XhO, 

for some e > 0. 

B. Theoretical Analysis 

In this section, we highlight some theoretical results derived 
for QBP. The analysis follows that of CS, and is inspired 
by derivations given in [16], [4], [32], [5], [47], [48], [6]. 
For further analysis on special cases of QBP and its noisy 
counterpart QBPD, please refer to [18]. 

First, it is convenient to introduce a linear operator B: 



B : X e 



^ {Tri&iX)}^^ e 



(12) 



We consider a generalization of the restricted isometry prop- 
erty (RIP) of the linear operator B. 

Definition 1 (RIP). A linear operator £?(•) as defined in (12) 
is (e, k)-RIP if 

■\B{XW 



- 1 



< e 



(13) 



IWI 2 

for all \\X\\q < k and 1/0. 

We can now state the following theorem: 

Theorem 2 (Recoverability/Uniqueness). Let x e C™ be 

a solution to (4). // X* G c(" +1 ) x (™ +1 ) satisfies y = 
B(X*), X* h 0, rank(X*) = 1, X* l x = 1 and if B(-) is a 
(e,2\\X*\\ )-RIP linear operator with e < 1 then X* and x 
are unique and X* 2 . n+l l = x. 

Proof: Assume the contrary i.e., X^n+i.i 7^ * an d nence 



that X* ^ 



[1 x H ]. It is clear that 



(9) || -XT* || o and hence 





T 




x 



Since 





"1" 




X 



[1 



x 1 



- X* 



[1 x H ]-X* 



[1 



X' 



< 



< 2||X*|| . 



< 2||X*|| , we can apply 



the RIP inequality (13) on 



use that y 



B 





"1" 


( 


X 


< 


e. 



[1 



X' 



B(X*) 

1 -x 



B 



[1 



[1 



x 



X*. If we 
and hence 



0, we are led to the contradiction 



1 < e. We therefore conclude that X* = 

X2- n+ i i = x and that X* and x are unique. 
We can also give a bound on the sparsity of x: 



[1 



x' 



Theorem 3 (Bound on ||^|| from above). Let x be the 

sparsest solution to (4) and let X be the solution of QBP 
(10). If X has rank 1 then ||-X"2 :n +i,i||o > ll^llo- 

Proof: Let X be a rank-1 solution of QBP (10). By 
contradiction, assume ||X 2:n+ i ! i||o < ||*||o- Since X 2: „+i j i 
satisfies the constraints of (4), it is a feasible solution of (4). As 
assumed, X 2 : n +i,i also gives a lower objective value than x in 
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(4). This is a contradiction since x was assumed to be the so- 
lution of (4). Hence we must have that ||X 2:n+i,i||o > ll^llo- 

■ 

The following result now holds trivially: 

Corollary 4 (Guaranteed recovery using RIP). Let x be the 

sparsest solution to (4). The solution of QBP X is equal to 

[l x H ] if it has rank 1 and B(-) is (e, 2\\X\\ )-RIP with 



to 



x 
i < 1. 

Proof: This follows trivially from Theorem 2 by realizing 
that X satisfy all properties of X*. ■ 
Given the RIP analysis, it may be that the linear operator 
B(-) does satisfy the RIP property defined in Definition 1 with 
a small enough e, as pointed out in [16]. In these cases, RIP-1 
may be considered: 

Definition 5 (RIP-1). A linear operator B(-) is (e, k)-RIP-l 
if 

'\B{X)h 

\\X\\i 



- 1 



< e 



(14) 



for all matrices 1^0 and ||X|| < k. 



Theorems 2-3 and Corollary 4 all hold with RIP replaced 
by RIP-1 and will not be restated in detail here. Instead, 
we summarize the most important property in the following 
theorem: 

Theorem 6 (Upper bound and recoverability using RIP-1). 

Let x be the sparsest solution to (4). The solution of QBP 



(10), X, is equal to 



[1 



x H ] if it has rank 1 and B(-) 



is (e,2\\X\\ )-RIP-l with e < 1. 

Proof: The proof follows trivially from the proof of 
Theorem 2. ■ 
The RIP-type argument may be difficult to check for a given 
matrix and are more useful for claiming results for classes 
of matrices/linear operators. For instance, it has been shown 
that random Gaussian matrices satisfy the RIP with high 
probability. However, given realization of a random Gaussian 
matrix, it is indeed difficult to check if it actually satisfies the 
RIP. Two alternative arguments are the spark condition [3] and 
the mutual coherence [49], [50]. The spark condition usually 
gives tighter bounds but is known to be difficult to compute 
as well. On the other hand, mutual coherence may give less 
tight bounds, but is more tractable. We will focus on mutual 
coherence, which is defined as: 

Definition 7 (Mutual coherence). For a matrix A, define the 
mutual coherence as 

|A H 4 A.,| 

fi{A) — max — — — ttt^ ~~ — 77. (15) 

l<i,j<n,i^j || A^i\\ || A. j || 

Let B be the matrix satisfying y = BX S = B(X) with 
X s being the vectorized version of X. We are now ready to 
state the following theorem: 

Theorem 8 (Recovery using mutual coherence). Let x be the 

sparsest solution to (4). The solution of QBP (10), X, is equal 



[l x H ] if it has rank 1 and \\X\\ Q < 0.5(1 + 1/ /u(B)). 



Proof: It follows from [49] [6, Thm. 5] that if 



mo<1 2V + m 



(16) 



then X is the sparsest solution to y = B(X). Since 
[l S H ] is by definition the sparsest rank 1 solution to 



x 



y = B(X), it follows that X 



[1 x"]. 



III. Numerical Algorithms 

In addition to the above analysis of guaranteed recovery 
properties, a critical issue for practitioners is the efficiency of 
numerical solvers that can handle moderate-sized SDP prob- 
lems. Several numerical solvers used in CS may be applied to 
solve nonsmooth SDPs, which include interior-point methods, 
e.g., used in CVX [51], gradient projection methods [52], 
and augmented Lagrangian methods (ALM) [52]. However, 
interior-point methods are known to scale badly to moderate- 
sized convex problems in general. Gradient projection methods 
also fail to meaningfully accelerate QBP due to the complexity 
of the projection operator. Alternatively, nonsmooth SDPs can 
be solved by ALM. However, the augmented primal and dual 
objective functions are still SDPs, which are equally expensive 
to solve in each iteration. There also exist a family of iterative 
approaches, often referred to as outer approximation methods, 
that successively approximate the solution of an SDP by 
solving a sequence of linear programs (see [53]). These 
methods approximate the positive semidefinite cone by a set of 
linear constraints and refine the approximation in each iteration 
by adding a new set of linear constraints. However, we have 
experienced slow convergence using these type of methods. 
In summary, QBP as a nonsmooth SDP is categorically more 
expensive to solve compared to the linear programs underlying 
CS, and the task exceeds the capability of many popular sparse 
optimization techniques. 

In this paper, we propose a novel solver to the nonsmooth 
SDP underlying QBP via the alternating directions method of 
multipliers (ADMM, see for instance [54] and [55, Sec. 3.4]) 
technique. The motivation to use ADMM is two-fold: 

1) It scales well to large data sets. 

2) It is known for its fast convergence. 

There are also a number of strong convergence results which 
further motivates the choice [54]. 

To set the stage for ADMM, let n denote the dimension 
of x, and let N denote the number of measurements. Then, 
rewrite (10) to the equivalent SDP 

niin Xl ,x 2 ,z f 1 (X 1 ) + f 2 (X 2 )+g(Z), 

subj. to X 1 -Z = 0, X 2 - Z = 0, 

where X 1 = X^ e C(" +1 ) x (™ +1 ), X 2 = X^ e 



(17) 
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( [-<(n+i)x(n+i) ) z = Z H e c(™ +1 ) x ( n+1 ), and 

'Tr(X) \£y i = Ti(* i X),i = l, 



h(X) 




and X^i = 1 
otherwise 



if X y 
otherwise 



Define two matrices Y"i and Y 2 as the Lagrange multipliers 
of the two equality constraints in (17), respectively. Then the 
update rules of ADMM lead to the following: 

X\ +1 = argmin x=X H/ i (X)+Tr(^(X-Z i )) 

+ fpr-s'll 2 , 

= argmin z=Z H g(Z) + ELi Tr(^Z) 
ZIP, 



z i+i 
Y l+1 



f||X< +1 
I^+p(X 



rl+l 



(18) 

for z = 1,2, where /? > is a parameter that enforces 
consensus between Xi, X 2 , and Z. Each of these steps has 
a tractable calculation. After some simple manipulations, we 
have: 



x<- 



l+i 



argmin x=X H 
subj. to 



(tf-^)ll, 



Let B : C ( " +1)x( ™ +1) 
operator such that B(X) 



\X lr7X 

= Tr^X), i = l,...,N, 
X M = 1. 

(19) 

-» C JV+1 be the augmented linear 
_ \B{X) 



operator defined by (12). Assuming 



, where B is the linear 
that a feasible solution 



exists, and defining Ilg as the orthogonal projection onto the 
convex set given by the linear constraints, i.e., = B{X), 

the solution is: X[ +1 =Tl s (Z l - i ±p L ). This matrix-valued 
problem can be solved by converting the linear constraint on 
Hermitian matrices into an equivalent constraint on real-valued 
vectors. 
Next, 



X 2 +1 = argmin 



X 



= n PSD [z l --^\ 
(20) 

where TIpsd denotes the orthogonal projection onto the 
positive-semidefinite cone, which can easily be obtained via 
eigenvalue decomposition. 

Finally, let X~' +1 = \ J^ 2 l=1 X\ +1 and similarly Y* . Then, 
the Z update rule can be written: 



Z l+1 = argmin z=Z T A||Z||i + p\\Z - (X 

A 

2p 



sort (A + 



■i+i 

p 



(21) 

where soft(-) in the complex domain is defined with respect 
to a positive real scalar q as: 



soft(x, q) 







if \x\ < q, 
otherwise. 



(22) 



Note that if the first argument is a complex value, the soft 
operator is defined in terms of the magnitude rather than the 
sign and if it is a matrix, the the soft operator acts element- 
wise. 

Setting I = 1,X[ = X l 2 = Z l = I, where I denotes 
the identity matrix, and p l = 1, setting I = 0, the Hermitian 
matrices X t +1 , Z^ 1 , Y^ 1 can now be iteratively computed 
using the ADMM iterations (18). The stopping criterion of the 
algorithm is given by: 



<ne 



abs 



abs 



e rel maxf 



jrel I 



IX' 



(23) 
(24) 



where e abs 7 e rel are algorithm parameters set to 10 3 and r l 
and s l are the primal and dual residuals, respectively, as: 

r l = [X[ -Z l X l 2 - Z l ] , (25) 
s l =- p [Z l - Z'- 1 Z l - Z 1 - 1 ] . (26) 

We also update p according to the rule discussed in [54]: 

if ||r'|| > fJ,\\s l \\, 
if\\s l \\>p\\r l \\, 



J+i _ 



= < 



TincrP 
P l /T d ec 



(27) 



otherwise, 



where r,, 



Tde 



and p are algorithm parameters. Values 
commonly used are p = 10 and Ti ncr = Td eC r = 2. 

In terms of the computational complexity of the ADMM 
algorithm, its inner loop calculates the updates of Xj, Z, and 
Y i, i = 1, 2. It is easy to see that its complexity is dominated 
by (19) and (20), which is bounded by 0(n 2 N 2 + n 3 ), while 
the calculation of Z and Y \ is linear with respect to the 
number of their elements. 

IV. Experiments 

In this section, we provide comprehensive experiments to 
validate the efficacy of the QBP algorithms in solving several 
representative nonlinear CS which depends quadratically on 
the unknown. We compare their performance primarily with 
two existing algorithms. As we mentioned in Section I, if 
an underdetermined nonlinear system is approximated up to 
the first order, the classical sparse solver in CS is basis 
pursuit. In NLCS literature, several greedy algorithms have 
been proposed for nonlinear systems. In this section, we 
choose to compare with the iterative hard thresholding (IHT) 
algorithm in [8] in Section IV-A and another greedy algorithm 
demonstrated in [27] in Section IV-C. 1 

A. Nonlinear Compressive Sensing in Real Domain 

In this experiment, we illustrate the concept of nonlinear 
compressive sensing. Assume that there is a cost associated 
with sampling and that we would like to recover Zq G R m , 
related to our samples j/j £ R, i = 1, . . . , N, via 



yi = fi(z ), i = l,... 



,N, 



(28) 



'Besides the comparisons shown here, we have also compared to a number 
of CPR algorithms [17], [36]. Not surprisingly, they performed badly on the 
general quadratic problems since they do not account for the linear term. 
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using as few samples as possible. Also, assume that there is 
a sparsifying basis D e R mxn , possibly overcomplete, such 
that 

z — T)x , with x sparse. (29) 



Hence, we have 



yi = fi(Dx ), i = l,...,N, 



(30) 



with x a sparse vector. If we approximate the nonlinear equa- 
tion system (30) using a second order Maclaurin expansion we 
endup with a set of quadratic equations, 

V 2 f (Q) 

Vl = ,A(0)+V,A(0)D:eo+4D T 2 Da5 °' % = ■ ■ ■ ' N - 

(31) 

Hence, we can use QBP to recover x a given {fi(x), Ui}fLi 
and D. 

In particular, let D = I, n = m = 20, N = 25, fi(x) = 
di + bjx + x J Q { x, i = 1,...,N, and generate {yi}fL 1 by 
sampling {dj, bi, Q i }^. 1 from a unitary Gaussian distribution. 
Let Xq be a binary vector with three elements different than 
zero. Given {t/j, a^, bi, Q^}^, the task is now to recover Xq. 
The results of this simulation are shown in Figure 1. 



(a) Ground truth. 



t T 



t IT ■ 



iL 



(b) QBP with A = 50. 



(c) QBP with A = 0. 



First, as the noiseless measurements are generated by a 
quadratic system of equations, it is not surprising that QBP 
perfectly recovers the sparse signal Xo when A = 50. One 
may wonder whether in the 25-D ambient space, the solution 
x is unique. To show that the solution is not unique, we let 
A = and again apply QBP. As shown in Figure 1 (c), the 
solution is dense and it also satisfies the quadratic constraints. 
Therefore, we have verified that the system is underdetermined 
and there exist multiple solutions. 

Second, in Figure 1 (d), we approximate (31) only up to the 
first order and set Q { = 0,i = 1,...,N. The approximation 
enables us to employ the classical basis pursuit algorithm in 
CS to seek the best 3-sparse estimate x. As expected, the 
approximation is not accurate enough, and the estimate is far 
from the ground truth. 

Third, we implement the iterative hard thresholding (IHT) 
algorithm in [8], and the correct number of nonzero coeffi- 
cients in Xq is also provided to the algorithm. Its estimate 
is given in Figure 1 (e). As IHT is a greedy algorithm, its 
performance is affected by the initialization. In Figure 1 (e), 
the initial value is set by x = 0, and the estimate is incorrect. 

Finally, we note that the advantage of using general CS the- 
ory is that fewer samples are needed to recover a source signal 
from its observations. This remains true for NLCS presented in 
this paper. However, as (28) and (31) are nonlinear equation 
systems, typically N ^> m measurements are required for 
recovering a unique solution. In the same simulation shown 
in Figure 1, one could ignore the sparsity constraint (i.e., , by 
letting A = in Figure 1 (c)), and it would require N' = 40 
observations for QBP to recover the unique solution, which is 
exactly the ground-truth signal. 

Clearly, Figure 1 is only able to illustrate one set of 
simulation results. To more systematically demonstrate the 
accuracy of the four algorithms in probability, a Monte Carlo 
simulation is performed that repeats the above simulation but 
with different randomly generated Xo and {aj, bi, Qj}. Table I 
shows the rates of successful recovery. We can see QBP 
achieves the highest success rate, which is followed by IHT. 
BP and the dense QBP solution basically fail to return enough 
good results. A = 50 was used in all trials. 

TABLE I 

The percentage of correctly recovering x in 100 trials. 



(d) Basis pursuit. 



(e) Iterative hard thresholding. 



Fig. 1. Estimated 20-D sparse signals measured in a simulated quadratic 
system of equations. The QBP solution perfectly recovers the ground truth 
with A = 50, while the remaining algorithms fail to recover the correct sparse 
coefficients. 



Method 


QBP (A = 50) QBP (A = 0) BP IHT 


Success rate 


79% 5% 3% 54% 



B. The Shepp-Logan Phantom 

In this experiment, we consider recovery of images from 
random samples. More specifically, we formulate an example 
of the CPR problem in the QBP framework using the Shepp- 
Logan phantom. Our goal is to show that using the QBPD 
algorithm provides approximate solutions that are visually 
close to the ground-truth images. 
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Consider the ground-truth image in Figure 2. This 30 x 30 
Shepp-Logan phantom has a 2D Fourier transform with 100 
nonzero complex coefficients. We generate N linear com- 
binations of pixels, and then measure the square of the 
measurements. This relationship can be written as: 

y = \Ax\ 2 = {x H a i a"x} 1 < l < N , (32) 

where A = RF is the concatenation of a random matrix R 
and the Fourier basis F, and the image Fa; is represented 
as a stacked vector in the 900-D complex domain. The CPR 
problem minimizes the following objective function: 

min||a;||i subj. to y = \ Ax\ 2 £ R N . (33) 

X 

Previously, an SDP solution to the non sparse phase retrieval 
problem was proposed in [16], which is called PhaseLift. In a 
sense, PhaseLift can be viewed as a special case of the QBP 
solution in (10) where A = 0, namely, the sparsity constraint 
is not enforced. In Figure 2 (b), the recovered result using 
PhaseLift is shown with N = 2400. 

To compare visually the performance of the QBP solution 
when the sparsity constraint is properly enforced, two recov- 
ered results are shown in Figure 2 (c) and (d) with N — 2400 
and 1500, respectively. Note that the number of measurements 
with respect to the sparsity in x is too low for both QBP and 
PhaseLift to perfectly recover x. Therefore, in this case, we 
employ the noisy version of the algorithm QBPD to recover 
the image. Wee can clearly see from the illustrations that 
QBPD provides a much better approximation and outperforms 
PhaseLift visually even though it uses considerably fewer 
measurements. 




(a) Ground truth (b) PhaseLift with N = 2400 




(c) QBPD with N = 2400 (d) QBPD with TV = 1500 



Fig. 2. Recovery of a Shepp-Logan Image by PhaseLift and QBPD. 

C. Subwavelength Imaging 

In this example, we present an example in sub-wavelength 
coherent diffractive imaging. The experiment and the data 
collection were conducted by [27]. 

Let yi, i = 1, . . . , N, be intensity samples of a 2D diffrac- 
tion pattern. The diffraction pattern is the result of a 532 nm 



laser beam passing through an arrangement of holes made on 
a opaque piece of glass. The task is to decide the location of 
the holes out of a number of possible locations. 

It can be shown that the relation between the intensity mea- 
surements and the arrangements of holes is of the following 
type: 

yi = \a?x\ 2 , i=l,...,N, (34) 

where € K, i = 1, . . . , N, are intensity measurements, ctj £ 
C™, i = 1, . . . , N, are known complex vectors and x £ W 1 , 
is the sought entity, each element giving the likelihood of a 
hole at a given location. 

We use QBPD with e = 0.0012 and A = 100. 89 
measurements were selected by taking every 200th intensity 
measurement from the dataset of [27]. The quantity x is 
from the setup of the experiment known to be real and 
dj = bi = Ci = 0. We hence have 

Vl = x T Q t x = \a^x\ 2 , i = l,...,N, (35) 

with Q t = aid? e C nxn , i = 1, . . . , N, and x £ R n . 

The resulting estimate is given to the left in Figure 3. The 
result deviates from the ground truth and the result presented 
in [27] (shown in Figure 3 right), and it actually finds a more 
sparse pattern. It is interesting to note that both estimates are 
however within the noise level estimated in [27]: 

-£fe-k H *| 2 ) 2 <l-8xlO- 6 . (36) 

i 

Therefore, under the same noise assumptions, the two solu- 
tions are equally likely to lead to the same observations y. 
However, knowing that there is a solution within the noise 
level that is indeed sparser than the ground-truth pattern, it 
should not be the optimal solution to have recovered the 
ground truth, since there exists a sparser solution. 




Fig. 3. The estimated sparse vector x. The crosses mark possible positions 
for holes, while the dots represent the recovered nonzero coefficients. Left: 
Recovered pattern by QBPD. Note that this estimate is sparser than the ground 
truth but within the estimated noise level. Right: Recovered pattern by the 
compressive phase retrieval method used in [27]. 

V. Conclusion 

Classical compressive sensing assumes a linear relation 
between samples and the unknowns. The ability to more 
accurately characterize nonlinear models has the potential 
to improve the results in both existing compressive sensing 
applications and those where a linear approximation does not 
suffice, e.g., phase retrieval. 
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This paper presents an extension of classical compressive 
sensing to quadratic relations or second order Taylor ex- 
pansions of the nonlinearity relating measurements and the 
unknowns. The novel extension is based on lifting and convex 
relaxations and the final formulation takes the form of a 
SDR The proposed method, quadratic basis pursuit, inherits 
properties of basis pursuit and classical compressive sensing 
and conditions for perfect recovery etc are derived. We also 
give an efficient numerical implementation. 
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